An open-source project offering a functional RAG UI for document QA, suitable for both end-users and developers. It supports various LLM providers, is customizable, and offers multi-modal QA, citations, and complex reasoning methods.
Discussion in r/LocalLLaMA about finding a self-hosted, local RAG (Retrieval Augmented Generation) solution for large language models, allowing users to experiment with different prompts, models, and retrieval rankings. Various tools and resources are suggested, such as Open-WebUI, kotaemon, and tldw.
This article discusses the importance of determining user query intent to enhance search results. It covers how to identify search and answer intents, implement intent detection using language models, and adjust retrieval strategies accordingly.
Researchers from Cornell University developed a technique called 'contextual document embeddings' to improve the performance of Retrieval-Augmented Generation (RAG) systems, enhancing the retrieval of relevant documents by making embedding models more context-aware.
Standard methods like bi-encoders often fail to account for context-specific details, leading to poor performance in application-specific datasets. Contextual document embeddings address this by enhancing the sensitivity of the embedding model to subtle differences in documents, particularly in specialized domains.
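The researchers proposed two complementary methods to improve bi-encoders: a contrastive training procedure that takes neighboring documents into account when forming batches, and a contextual architecture that encodes information about neighboring documents into each embedding.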
These modifications allow the model to capture both the general context and specific details of documents, leading to better performance, especially in out-of-domain scenarios. The new technique has shown consistent improvements over standard bi-encoders and can be adapted for various applications beyond text-based models.
The article discusses the challenges and components required to scale Retrieval Augmented Generation (RAG) from a Proof of Concept (POC) to production. It covers key issues such as performance, data management, risk, integration into workflows, and cost. It also outlines architectural components such as scalable vector databases, caching mechanisms, advanced search techniques, responsible AI layers, and API gateways needed for overcoming these challenges.
This article discusses the importance of real-time access for Retrieval Augmented Generation (RAG) and how Redis can enable this through its real-time vector database, semantic cache, and LLM memory capabilities, leading to faster and more accurate responses in GenAI applications.
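The semantic cache idea mentioned above can be illustrated with a minimal in-memory sketch. This is not Redis's API: `SemanticCache`, `toy_embed`, and the `threshold` value are hypothetical names chosen for illustration, and the bag-of-words "embedding" stands in for a real embedding model. The point is only that a lookup matches on similarity of meaning rather than on the exact query string.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in embedding (assumption): bag-of-words term counts."""
    return dict(Counter(re.findall(r"[a-z0-9]+", text.lower())))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Returns a stored response when a new query is similar enough to an old one."""

    def __init__(self, embed=toy_embed, threshold=0.8):
        self.embed = embed
        self.threshold = threshold  # hypothetical cut-off; tune per application
        self.entries = []  # list of (query_vector, response) pairs

    def put(self, query, response):
        self.entries.append((self.embed(query), response))

    def get(self, query):
        query_vec = self.embed(query)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine(query_vec, vec)
            if score > best_score:
                best_score, best_response = score, response
        # Only count it as a hit if the closest stored query is similar enough.
        return best_response if best_score >= self.threshold else None
```

A caller would check `cache.get(query)` before invoking the LLM and call `cache.put(query, answer)` afterwards; a production system would replace the linear scan with a vector index.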
Dr. Leon Eversberg explains how to improve the retrieval step in RAG pipelines using the HyDE technique, making LLMs more effective in accessing external knowledge through documents.
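As a rough sketch of the HyDE idea (Hypothetical Document Embeddings): instead of embedding the user's query directly, an LLM first writes a hypothetical answer, and that answer is embedded and used for the similarity search. The code below is not taken from the article. `hyde_retrieve`, `toy_embed`, and `fake_llm` are hypothetical names, the bag-of-words embedding stands in for a dense embedding model, and `fake_llm` returns a canned string where a real pipeline would call a language model.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in embedding (assumption): bag-of-words term counts."""
    return dict(Counter(re.findall(r"[a-z0-9]+", text.lower())))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hyde_retrieve(query, docs, generate, embed=toy_embed, top_k=2):
    """Rank docs by similarity to a generated hypothetical answer, not to the query."""
    hypothetical_answer = generate(query)   # step 1: LLM drafts an answer
    answer_vec = embed(hypothetical_answer) # step 2: embed the draft
    ranked = sorted(
        range(len(docs)),
        key=lambda i: cosine(answer_vec, embed(docs[i])),
        reverse=True,
    )
    return ranked[:top_k]                   # step 3: indices of the closest docs


def fake_llm(query):
    """Placeholder for a real LLM call (assumption): returns a canned answer."""
    return ("HyDE asks a language model to write a hypothetical answer "
            "and embeds that answer for retrieval.")
```

The short query "What is HyDE?" shares almost no vocabulary with the target document, while the generated answer does, which is the gap the technique is meant to close.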
Covers the foundational concepts and practical implementation of semantic search and the workflow of RAG, highlighting its advantages and versatile applications.
The article provides a step-by-step guide to implementing a basic semantic search using TF-IDF and cosine similarity. This includes preprocessing steps, converting text to embeddings, and searching for relevant documents based on query similarity.
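The steps described above (preprocess, vectorize with TF-IDF, rank by cosine similarity) can be sketched with the standard library alone. This is an illustrative sketch, not the article's code: the function names, the lowercase-alphanumeric tokenizer, and the smoothed IDF formula are assumptions made here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Preprocessing: lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(docs_tokens):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector; terms unseen in the corpus are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, top_k=3):
    """Return (doc_index, score) pairs, best match first."""
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    doc_vecs = [tfidf_vector(tokens, idf) for tokens in docs_tokens]
    query_vec = tfidf_vector(tokenize(query), idf)
    scored = sorted(
        ((cosine(query_vec, vec), i) for i, vec in enumerate(doc_vecs)),
        reverse=True,
    )
    return [(i, score) for score, i in scored[:top_k]]
```

Because TF-IDF matches on exact terms, a query such as "reranker" will not match a document that only contains "rerankers"; stemming or dense embeddings are the usual remedies.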
This page provides documentation for the rerank API, including endpoints, request parameters, and response formats.
Maximize search relevancy and RAG accuracy with Jina Reranker. Features include multilingual retrieval, code search, and a 6x speedup over the previous version.